feat(gcloud): add Google Cloud credential provider#310
Conversation
- Wire --project, --key-file, --impersonate-service-account, --scopes flags in grant.go CLI (previously documented but not implemented) - Fix EndpointProvider detection: use string match for AWS to avoid gcloud credentials being parsed as AWS config - Add /computeMetadata/ path check for 169.254.169.254 routing to prevent AWS IMDS conflict - Return defensive copy from Scopes() to prevent slice aliasing - Use exec.CommandContext in detectProject() for cancellation support - Extract DefaultEmail constant, handle zero-expiry tokens in metadata - Fix stale GCE_METADATA_HOST references in doc comments - Fix project detection order description in guide
📦 Previous Review (superseded by newer review)Claude finished @dpup's task in 5m 2s —— View job Code Review
Overall this is a solid implementation that follows the AWS provider pattern faithfully. The architecture (host-side ADC → token cache → metadata emulation via proxy) is correct, the tests are thorough, and the daemon persistence/restore path is wired in consistently with the existing AWS provider. Five issues worth addressing, roughly in order of priority:
|
| // For 169.254.169.254, also check the path to avoid intercepting AWS IMDS | ||
| // requests when both gcloud and aws grants are active on the same run. | ||
| if r.Host == "metadata.google.internal" || | ||
| (strings.HasPrefix(r.Host, "169.254.169.254") && strings.HasPrefix(r.URL.Path, "/computeMetadata/")) { |
There was a problem hiding this comment.
strings.HasPrefix(r.Host, "169.254.169.254") is too broad — it matches any host that starts with that prefix, including 169.254.169.254.evil.com. Use an exact match plus a port-suffix variant:
| (strings.HasPrefix(r.Host, "169.254.169.254") && strings.HasPrefix(r.URL.Path, "/computeMetadata/")) { | |
| (( r.Host == "169.254.169.254" || strings.HasPrefix(r.Host, "169.254.169.254:")) && strings.HasPrefix(r.URL.Path, "/computeMetadata/")) { |
Low practical impact (the path check and Metadata-Flavor header guard both fire), but it's semantically wrong.
| "GOOGLE_CLOUD_PROJECT="+r.GCloudCredentialProvider.ProjectID(), | ||
| "CLOUDSDK_CORE_PROJECT="+r.GCloudCredentialProvider.ProjectID(), | ||
| // Prevent gcloud from trying to read local credential files. | ||
| "CLOUDSDK_AUTH_DISABLE_CREDENTIALS_FILE=true", |
There was a problem hiding this comment.
CLOUDSDK_AUTH_DISABLE_CREDENTIALS_FILE doesn't appear in the gcloud Python SDK source or official environment variable docs. The closest real ones are CLOUDSDK_AUTH_CREDENTIAL_FILE_OVERRIDE and GOOGLE_APPLICATION_CREDENTIALS.
In practice this is harmless — containers don't have ~/.config/gcloud/ mounted, so the gcloud CLI falls back to the metadata server regardless. But if the variable is a no-op it should be removed (the behaviour is already correct without it), or replaced with the correct variable name.
|
|
||
| ui.Infof("Using Google Cloud project: %s", cfg.ProjectID) | ||
|
|
||
| cred := &provider.Credential{ |
There was a problem hiding this comment.
The cfg.Email field is never populated during the grant flow, so MetaKeyEmail is never written. The metadata endpoint will always serve default@moat.local for the email, even when impersonating a real service account.
gcloud auth list inside the container will show default@moat.local rather than the impersonated SA address, which could be confusing. At a minimum, when --impersonate-service-account is set, the provided email is known:
| cred := &provider.Credential{ | |
| cred := &provider.Credential{ | |
| Provider: "gcloud", | |
| Token: "", // no static token; tokens are minted on demand | |
| CreatedAt: time.Now(), | |
| Metadata: map[string]string{ | |
| MetaKeyProject: cfg.ProjectID, | |
| MetaKeyScopes: strings.Join(cfg.Scopes, ","), | |
| }, | |
| } | |
| if cfg.ImpersonateSA != "" { | |
| cred.Metadata[MetaKeyImpersonate] = cfg.ImpersonateSA | |
| cred.Metadata[MetaKeyEmail] = cfg.ImpersonateSA // SA email is known | |
| } |
| } | ||
|
|
||
| // classifyError returns a user-friendly error message for common token errors. | ||
| func classifyError(err error) string { |
There was a problem hiding this comment.
classifyError matches against error message strings from the Google SDK. These strings are not part of any stable API and could silently break with a dependency update, turning the helpful messages into the generic fallback.
It's not easy to use errors.As/errors.Is here since the Google SDK wraps errors opaquely. Consider at least testing the error strings with strings.Contains(strings.ToLower(msg), ...) to be case-insensitive, and/or adding a comment that these need to be audited when golang.org/x/oauth2 is bumped.
| return nil, ctx.Err() | ||
| } | ||
|
|
||
| p.mu.Lock() |
There was a problem hiding this comment.
The mutex is held across the p.source.Token() call (line 103), which can involve a network round-trip to refresh OAuth tokens. This serialises all concurrent metadata requests for the duration of the refresh. For a metadata emulation server that can receive burst requests from both the gcloud CLI and Go client libraries simultaneously, this could cause noticeable latency spikes.
Consider a singleflight group so that concurrent callers that race past the cache check all wait on the single in-flight refresh instead of queueing behind the mutex:
var g singleflight.Group
func (p *CredentialProvider) GetToken(ctx context.Context) (*oauth2.Token, error) {
if ctx.Err() != nil {
return nil, ctx.Err()
}
p.mu.RLock()
if p.cached != nil && p.cached.Valid() && time.Now().Add(credentialRefreshBuffer).Before(p.cached.Expiry) {
tok := p.cached
p.mu.RUnlock()
return tok, nil
}
p.mu.RUnlock()
v, err, _ := p.g.Do("token", func() (any, error) {
tok, err := p.source.Token()
if err == nil {
p.mu.Lock()
p.cached = tok
p.mu.Unlock()
}
return tok, err
})
...
}This is a low-priority improvement for the current use case, but worth noting.
- Use exact host match for 169.254.169.254 instead of prefix match - Remove CLOUDSDK_AUTH_DISABLE_CREDENTIALS_FILE (not a real gcloud env var) - Set email from impersonated SA when --impersonate-service-account is used - Add fragility comment to classifyError string matching
📦 Previous Review (superseded by newer review)Claude finished @dpup's task in 3m 45s —— View job Code Review
All five issues from the previous review have been addressed. The proxy host-matching fix, removal of the undocumented env var, email population for impersonation, and the Three new issues below, one of which is a real functional bug. Bug: relative key file path breaks on daemon restore
if v, ok := ctx.Value(ctxKeyKeyFile).(string); ok && v != "" {
cfg.KeyFile = v
}The path is stored as-is in the credential metadata and later passed to Fix — resolve to absolute before storing: if v, ok := ctx.Value(ctxKeyKeyFile).(string); ok && v != "" {
if abs, err := filepath.Abs(v); err == nil {
cfg.KeyFile = abs
} else {
cfg.KeyFile = v
}
}Minor: service-accounts listing duplicates
|
gcloud CLI's GCE detection bypasses HTTP_PROXY (uses ReadNoProxy with
ProxyHandler({})) and checks GCE_METADATA_ROOT instead of GCE_METADATA_HOST.
This meant the metadata emulator was unreachable during GCE detection, causing
gcloud to conclude it was not running on GCE.
Three fixes:
1. Direct metadata handler in proxy: handle requests arriving without
Proxy-Authorization by matching Metadata-Flavor header + /computeMetadata/
path. The daemon wires a resolver that finds the gcloud handler from the
registry without requiring a proxy auth token.
2. Container env vars: set GCE_METADATA_ROOT (for gcloud CLI) and
GCE_METADATA_HOST (for google-auth client libraries) pointing directly
at the proxy, plus CLOUDSDK_CORE_ACCOUNT for the active account.
3. Endpoint handler: normalize email-based service account paths to
"default" and handle ?recursive=true queries that gcloud uses to
fetch all account details in one request.
Prevent cross-run credential leakage in FindGCloudHandler by returning nil when multiple gcloud runs are registered. Add table-driven tests for normalizeSAPath covering email-based identifiers, passthrough cases, and recursive query integration. Use gcloud.DefaultEmail constant instead of inline string in manager.go.
Previously FindGCloudHandler refused to serve when multiple runs had gcloud configured, even if they shared the same credentials. Now it compares the credential profile across runs — same profile means identical credentials, so any handler is equivalent. Different profiles still return nil to prevent cross-run credential leakage. Adds Profile field to GCloudConfig and populates it from credential.ActiveProfile during run creation.
📦 Previous Review (superseded by newer review)Claude finished @dpup's task in 4m 23s —— View job Code Review
Good progress since the last review. The proxy host-matching fix, removal of the undocumented env var, impersonation email population, and Still open: relative key file path breaks on daemon registration and restore
The key file path is stored as-is from the CLI flag and ends up in There are two failure points, not one: Fix in if v, ok := ctx.Value(ctxKeyKeyFile).(string); ok && v != "" {
if abs, err := filepath.Abs(v); err == nil {
cfg.KeyFile = abs
} else {
cfg.KeyFile = v
}
}Still open: service-accounts listing duplicates
|
- Resolve key file path to absolute before storing in credential metadata so relative paths work when the daemon reads them from a different working directory - Fix token cache for zero-expiry tokens: treat Expiry.IsZero() as indefinitely valid instead of bypassing cache on every request - Only list email-based service account in metadata listing when a real email is known; omit the synthetic default@moat.local to avoid duplicate entries in gcloud auth list - Add --scopes documentation to the gcloud guide
📦 Previous Review (superseded by newer review)Claude finished @dpup's task in 3m 42s —— View job Code Review
All issues from the previous three review rounds have been addressed. The implementation is solid and ready to merge with one minor issue flagged below. Resolved from previous rounds
Still deferred (acknowledged)Write mutex held across network I/O ( Minor:
|
|
Addressed all four items from the latest review in aee6aaf:
Mutex-across-network-IO remains deferred as before (low priority for this use case). |
|
Claude finished @dpup's task in 3m 50s —— View job Code Review
All nine issues from the previous four rounds have been addressed. The implementation is clean and ready to merge. Resolved from previous rounds ✓
Still deferred (acknowledged)Write mutex held across network I/O ( Minor: numeric project ID "0" not listed in Limitations
The endpoint at The comment in code already says "not available; return 0" — it just needs a corresponding entry in the guide's Limitations section:
Correctness check: all other paths confirmed
|
Summary
moat grant gcloudprovider that authenticates gcloud CLI and Google client libraries inside containers via GCE metadata server emulation--project,--key-file,--impersonate-service-account, and--scopesCLI flagsGCE_METADATA_ROOTArchitecture
Follows the AWS provider pattern: host-side
CredentialProviderwith token caching →EndpointHandlerserving GCE metadata routes → per-run handler attached via daemonRunContext→ proxy intercepts metadata requests viaHTTP_PROXY.Key differences from header-injection providers (GitHub, Claude):
Direct metadata handling
gcloud CLI and Python's
google-authlibrary use barehttp.clientfor GCE metadata detection, bypassingHTTP_PROXY. The proxy handles these direct requests by:Metadata-Flavor: Googleheader on non-proxied requestsContainer environment
Sets
GCE_METADATA_HOST,GCE_METADATA_ROOT,CLOUDSDK_CORE_ACCOUNT,GOOGLE_CLOUD_PROJECT, andCLOUDSDK_CORE_PROJECTto point the gcloud CLI and Google client libraries at the proxy's metadata emulator.Followup
Endpoint provider abstraction to decouple AWS/gcloud-specific logic from core daemon/proxy packages:
docs/plans/2026-04-14-endpoint-provider-abstraction-design.mdTest plan
go build ./...passesmake test-unitpasses (pre-existing env-dependent failures in container/deps only)make lintclean for all changed filesmoat grant gcloud --project <p>+moat run --grant gcloud+gcloud projects listinside container